Method for determining the existence of a mutation

ABSTRACT

In a method for determining the existence of a mutation in a nucleic acid fragment from an electrical signal generated by a DNA sequencer made up of sequence information produced by fluorescent fragment products of different lengths of said nucleic acid fragment, followed by a run-off peak produced by fluorescent full length fragment products of said nucleic acid fragment, any difference between said run-off peak and a reference run-off peak generated by unmutated full length fragment products of said nucleic acid fragment, is determined, a difference indicating the existence of a deletion or insertion mutation.

This is a 371 of PCT/SE96/01432, filed Nov. 07, 1996.

TECHNICAL FIELD

The invention relates to a method and an apparatus for determining the existence of a mutation in a nucleic acid fragment from an electric signal generated by a DNA sequencer and made up of sequence information produced by fluorescent fragment products of different lengths of said nucleic acid fragment, followed by a run-off peak produced by fluorescent full length fragment products of said nucleic acid fragment.

BACKGROUND OF THE INVENTION

DNA sequencing, i.e. determining the sequence of nucleotides in a gene or in a segment of DNA, commonly involves several sequential steps aimed at:

Isolating genetic material from biological material.

Amplifying the gene of interest using polymerase chain reaction (PCR) so that sufficient material of the gene of interest is available for sequence analysis.

Performing sequencing reactions using the principles of Sanger. This step enzymatically generates a large number of differently elongated complementary copies of the gene. By introduction of base specific elongation terminators, each elongated copy of the gene will terminate with a specific type of nucleotide. Each reaction corresponds to one specific type of nucleotide, Adenine (A), Thymine (T), Guanine (G) or Cytosine (C), i.e. only one type of elongation terminator will be incorporated in each reaction. To enable detection of these elongated gene copies, a fluorescently labelled molecule is introduced in the gene copy during the enzymatic reaction. Thus, all elongated gene copies will be fluorescently labelled to facilitate their detection.

Separating the mixture of differently elongated complementary copies of the gene according to their molecular size using gel electrophoresis.

Sequentially detecting the differently elongated complementary copies of the gene, separated according to molecular size during gel electrophoresis, using a DNA sequencer system, e.g. the DNA sequencer marketed under the trademark ALF by Pharmacia Biotech AB, Uppsala, Sweden. During the electrophoresis, any fluorescent molecules passing a perpendicularly oriented laser beam will be excited, and the fluorescence from each molecule will be detected by light sensitive detectors, each representing one type of nucleotide in the sample.

Determining the nucleotide sequence by superimposing the signals from four detectors representing the four different nucleotides of the sample gene.

An example of such signals obtained from four such detectors is shown in FIG. 1 on the appended drawing. As is apparent, the diagram in FIG. 1 is divided into a sequence region, containing sequence data, and a run-off region, containing the so-called “run-off peak”, which is described in more detail below.
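The superposition of the four detector signals described above can be illustrated with a deliberately simple sketch. It assumes that each detector trace is an equally sampled list of intensities on a shared time axis, and it calls a base wherever the combined envelope has a local maximum above a hypothetical `min_height` threshold. The function name `call_bases`, the threshold and the sample data are illustrative assumptions and are not taken from any particular sequencer.

```python
def call_bases(channels, min_height=1.0):
    """Call one base per local maximum of the superimposed detector signals.

    channels: dict mapping a base letter to a list of intensities
              (assumed: all lists have equal length and share one time axis).
    min_height: hypothetical noise threshold below which maxima are ignored.
    """
    length = len(next(iter(channels.values())))
    # Superimpose the four traces by taking the strongest signal at each point.
    envelope = [max(trace[k] for trace in channels.values()) for k in range(length)]
    calls = []
    for k in range(1, length - 1):
        is_maximum = envelope[k - 1] < envelope[k] >= envelope[k + 1]
        if is_maximum and envelope[k] >= min_height:
            # The detector with the strongest signal decides the base.
            calls.append(max(channels, key=lambda base: channels[base][k]))
    return "".join(calls)


# Illustrative, made-up traces: one peak per detector.
example_channels = {
    "A": [0, 5, 0, 0, 0, 0, 0, 0, 0],
    "C": [0, 0, 0, 6, 0, 0, 0, 0, 0],
    "G": [0, 0, 0, 0, 0, 4, 0, 0, 0],
    "T": [0, 0, 0, 0, 0, 0, 0, 7, 0],
}
print(call_bases(example_channels))
```

A practical base caller would additionally have to deal with baseline drift, overlapping peaks and differences in fragment mobility, none of which are modelled here.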

Sequence data obtained by processing and sequencing DNA samples from tumour tissue in an automatic sequencer can be used to detect inherited or induced mutations in genes related to the occurrence or progression of the tumour. When mutated, the sample sequence obtained from the sequencer will often consist of a mixture of two superimposed sequence components, namely the wild type component and a mutated component. This could be due either to a mixture of two cell populations in the sample or to a mutation in one of the two copies of the gene, if both are present in the sample. In cases where the mutated sequence component is the predominant component, insertion and deletion mutations as well as point mutations can be readily detected by aligning the sequence data obtained from the sample with the expected wild type sequence using standard alignment algorithms (see e.g. S. Needleman and C. Wunsch, J. Mol. Biol. 48, 444 (1970), and W. R. Pearson and W. Miller, Methods in Enzymology, 210, 575 (1992)). Often, however, the mutated sequence material is mixed with an equally large amount of non-mutated material. In some cases, the non-mutated material will even be predominant. In these cases ordinary alignment algorithms fail to resolve the mutation.
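The alignment-based detection mentioned above can be illustrated with a minimal global alignment sketch in the style of Needleman and Wunsch. The scoring values (match +1, mismatch -1, gap -1), the function name `global_align` and the example sequences are assumptions chosen for illustration and are not taken from the cited references. The sketch covers only the favourable case in which the sample consists of mutated material alone; it does not address the mixed-sample case in which such alignment fails.

```python
def global_align(ref, sample, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_ref, aligned_sample)."""

    def pair_score(i, j):
        return match if ref[i - 1] == sample[j - 1] else mismatch

    rows, cols = len(ref) + 1, len(sample) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the dynamic programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two bases
                score[i - 1][j] + gap,                   # gap in the sample
                score[i][j - 1] + gap,                   # gap in the reference
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_ref, aligned_sample = [], []
    i, j = len(ref), len(sample)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            aligned_ref.append(ref[i - 1])
            aligned_sample.append(sample[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(ref[i - 1])
            aligned_sample.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_sample.append(sample[j - 1])
            j -= 1
    return "".join(reversed(aligned_ref)), "".join(reversed(aligned_sample))


# Hypothetical wild type sequence and a sample lacking two bases.
wild_type, mutated = global_align("ACGTTGCA", "ACGGCA")
print(wild_type)
print(mutated)
```

Gap characters in the aligned sample mark bases missing relative to the wild type, i.e. a deletion; gap characters in the aligned reference would mark an insertion.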

BRIEF DESCRIPTION OF THE INVENTION

The object of the invention is to bring about a simple and reliable method of determining the existence of a mutation in a nucleic acid fragment.

In the method according to the invention for determining the existence of a mutation in a nucleic acid fragment from an electric signal generated by a DNA sequencer and made up of sequence information produced by fluorescent fragment products of different lengths of said nucleic acid fragment, followed by a run-off peak produced by fluorescent full length fragment products of said nucleic acid fragment, this is attained, mainly, by determining any difference between said run-off peak and a reference run-off peak generated by unmutated full length fragment products of the nucleic acid fragment, a difference indicating the existence of a deletion or insertion mutation.

This object is also attained by the apparatus according to the invention for determining the existence of a mutation in a nucleic acid fragment from an electric signal generated by a DNA sequencer and made up of sequence information produced by fluorescent fragment products of different lengths of said nucleic acid fragment, followed by a run-off peak produced by fluorescent full length fragment products of said nucleic acid fragment, mainly, in that it comprises means for determining any difference between said run-off peak and a reference run-off peak generated by unmutated full length fragment products of said nucleic acid fragment, a difference indicating the existence of a deletion or insertion mutation.

BRIEF DESCRIPTION OF THE DRAWING

The invention will be described in more detail below with reference to the appended drawings, in which

FIG. 1 shows the trace of an electrical signal produced by a DNA sequencer, with the sequence region and the run-off region indicated,

FIG. 2a schematically shows a normal run-off peak,

FIG. 2b schematically shows a broadened run-off peak, and

FIG. 2c schematically shows a split run-off peak.

PREFERRED EMBODIMENTS

When separating fluorescent DNA fragments of limited size in an electrophoresis gel and exciting these fragments to fluoresce, e.g. in the case of direct sequencing of PCR fragments, full length fragment products exist where the polymerase has not incorporated any chain terminators. This results in the presence of a prominent peak, the so-called “run-off peak”, in all raw data signal curves from a sequencer. This run-off peak is located at the end of the actual sequence information.

An example of a normal run-off peak generated by unmutated full length fragment products of a nucleic acid fragment is schematically shown in FIG. 2a. The run-off peak in FIG. 2a may correspond to the run-off peak in the run-off region shown in FIG. 1.

If two species of DNA fragments are present in the sequencing reaction, e.g. one from a tumour tissue and one from surrounding normal tissue, they will generate one run-off peak each.

If the DNA fragments are of equal size, these two run-off peaks will, of course, coincide.

However, in case of a deletion or insertion mutation in one of the fragments, the fragments will differ in size and two separate run-off peaks will be generated.

A small deletion or insertion will result in a broadening of the run-off peak.

Such a broadened run-off peak originating from a small insertion mutation is schematically shown in FIG. 2b. As is apparent from FIG. 2b, the normal “unmutated” run-off peak of FIG. 2a has been broadened, so that the run-off peak of FIG. 2b ends after the run-off peak of FIG. 2a.

In case of a large deletion mutation, a split run-off peak will be generated as shown in FIG. 2c. As is apparent from FIG. 2c, the normal “unmutated” run-off peak coincides with the normal “unmutated” run-off peak of FIG. 2a, while the run-off peak from the fragments in which a large deletion mutation is present will appear before the normal “unmutated” run-off peak.
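The three situations of FIGS. 2a to 2c can be reproduced with a small simulation. This is a sketch under stated assumptions: each run-off peak is modelled as a Gaussian on an arbitrary time axis, the peak standard deviation `sigma`, the peak positions and the equal peak heights are all hypothetical, and the helper names `run_off_trace` and `count_maxima` are illustrative. With this peak shape, two equally high peaks merge into one broadened peak when they lie closer together than twice `sigma`, and appear as a split peak when they lie further apart.

```python
import math


def run_off_trace(components, n_points=200, sigma=2.0):
    """Sum Gaussian run-off peaks given as (center, height) pairs."""
    return [
        sum(h * math.exp(-0.5 * ((t - c) / sigma) ** 2) for c, h in components)
        for t in range(n_points)
    ]


def count_maxima(trace, min_height=0.05):
    """Count local maxima above a hypothetical noise threshold."""
    return sum(
        1
        for k in range(1, len(trace) - 1)
        if trace[k] > min_height and trace[k - 1] < trace[k] >= trace[k + 1]
    )


# Unmutated fragments only: a single normal peak (compare FIG. 2a).
normal = run_off_trace([(100, 1.0)])
# Small insertion: slightly longer fragments arrive a little later (compare FIG. 2b).
broadened = run_off_trace([(100, 0.5), (103, 0.5)])
# Large deletion: much shorter fragments arrive clearly earlier (compare FIG. 2c).
split = run_off_trace([(100, 0.5), (80, 0.5)])

for name, trace in (("normal", normal), ("broadened", broadened), ("split", split)):
    print(name, count_maxima(trace))
```

Real run-off peaks need not be Gaussian, and the separation at which two peaks become distinguishable depends on the resolution of the electrophoresis gel.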

The indication of insertion and deletion mutations based on the run-off peak behaviour is very sensitive, and a contribution of less than 5% mutated material can be readily detected. The resolution in size difference between the two fragments is, however, limited to about ±2 bases. This is of course dependent on the resolution of the electrophoresis gel.

It should be understood, however, that when the run-off peak information alone is used for the mutation assignment, no information about the location of the mutation along the DNA fragment can be obtained.

The run-off peak information is a sensitive means of mutation detection and may be used as a consistency check of the mutation assignment derived from sequence data.

In accordance with the invention, any difference between a run-off peak generated by mutated full length fragment products, such as the run-off peak shown in FIG. 2b, and a normal “unmutated” run-off peak or reference run-off peak generated by unmutated full length fragment products, such as the run-off peak shown in FIG. 2a, is determined as an indication of the existence of a mutation.

In accordance with a first embodiment of the method according to the invention, the difference between the peak width of the run-off peak in FIG. 2b and the peak width of the run-off peak in FIG. 2a is measured, since the size of the peak width difference is directly proportional to the size of the mutation.

If, as in FIG. 2b, the wider peak ends after the normal peak of FIG. 2a, this is an indication of an insertion mutation.

However, should the wider run-off peak begin before the run-off peak of FIG. 2a, this is an indication of a deletion mutation.
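The width comparison of the first embodiment can be sketched as follows. The text does not specify how the beginning and end of a peak are measured, so this sketch assumes that a peak extends over all samples at or above half of its own maximum, and that the sample trace and the reference trace share the same time axis. The names `peak_extent` and `classify_by_width`, the `min_diff` threshold and the example traces are hypothetical.

```python
def peak_extent(trace, level=0.5):
    """Return (begin, end) indices of the region at or above level * maximum."""
    threshold = level * max(trace)
    above = [k for k, value in enumerate(trace) if value >= threshold]
    return above[0], above[-1]


def classify_by_width(sample, reference, min_diff=1):
    """Compare run-off peak widths; return (label, width difference in samples)."""
    s_begin, s_end = peak_extent(sample)
    r_begin, r_end = peak_extent(reference)
    width_diff = (s_end - s_begin) - (r_end - r_begin)
    if width_diff < min_diff:
        return "no indel indicated", width_diff
    ends_after = s_end > r_end
    begins_before = s_begin < r_begin
    if ends_after and not begins_before:
        return "insertion", width_diff   # wider peak ends after the reference
    if begins_before and not ends_after:
        return "deletion", width_diff    # wider peak begins before the reference
    return "ambiguous", width_diff


# Made-up run-off regions: the mutated samples mix the reference peak
# with a copy shifted two samples later (insertion) or earlier (deletion).
reference = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0]
insertion_sample = [0, 0, 1, 5, 10, 10, 10, 5, 1, 0, 0]
deletion_sample = [1, 5, 10, 10, 10, 5, 1, 0, 0, 0, 0]

print(classify_by_width(insertion_sample, reference))
print(classify_by_width(deletion_sample, reference))
```

Converting the width difference from samples into bases would require a calibration of the time axis, which is not attempted here.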

In accordance with a second embodiment of the method according to the invention, the difference between the run-off peaks in FIGS. 2a and 2b may be determined by measuring any difference between the location of the peak centers of these run-off peaks.

The peak center of the run-off peak of FIG. 2b is located after the center of the run-off peak of FIG. 2a. This is then an indication of an insertion mutation.

If, as in FIG. 2c, the center of the “mutated” run-off peak is located in front of the center of the normal or reference run-off peak, this indicates the presence of a deletion mutation.

The size of the difference between the peak centers is directly proportional to the size of the mutation.
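The center comparison of the second embodiment can be sketched in a similar way. The text does not state how a peak center is computed, so this sketch assumes the intensity-weighted centroid of the whole run-off region, with the sample and reference traces on a shared time axis. The names `centroid` and `classify_by_center`, the `tolerance` value and the example traces are hypothetical.

```python
def centroid(trace):
    """Intensity-weighted mean position of a run-off region."""
    total = sum(trace)
    if total == 0:
        raise ValueError("trace contains no signal")
    return sum(k * value for k, value in enumerate(trace)) / total


def classify_by_center(sample, reference, tolerance=0.25):
    """Compare run-off peak centers; return (label, shift in samples)."""
    shift = centroid(sample) - centroid(reference)
    if shift > tolerance:
        return "insertion", shift    # center located after the reference center
    if shift < -tolerance:
        return "deletion", shift     # center located in front of the reference center
    return "no indel indicated", shift


# Made-up run-off regions: the mutated samples mix the reference peak
# with a copy shifted two samples later (insertion) or earlier (deletion).
reference = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 0]
insertion_sample = [0, 0, 1, 5, 10, 10, 10, 5, 1, 0, 0]
deletion_sample = [1, 5, 10, 10, 10, 5, 1, 0, 0, 0, 0]

print(classify_by_center(insertion_sample, reference))
print(classify_by_center(deletion_sample, reference))
```

Because the sample region here contains both mutated and unmutated products, the centroid shift of the whole region also scales with the fraction of mutated material, so this sketch reports only the direction of the shift and its raw size in samples.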

As should be apparent from the above, the method according to the invention is a simple and reliable method of determining not only the existence of a mutation in a nucleic acid fragment but also whether the mutation is a deletion mutation or an insertion mutation.

The apparatus (not shown) according to the invention for determining the existence of a mutation in a nucleic acid fragment from an electric signal generated by a DNA sequencer and made up of sequence information produced by fluorescent fragment products of different lengths of said nucleic acid fragment, followed by a run-off peak produced by fluorescent full length fragment products of said nucleic acid fragment, comprises means (not shown) for determining any difference between said run-off peak and a reference run-off peak generated by unmutated full length fragment products of said nucleic acid fragment, a difference indicating the existence of a deletion or insertion mutation.

In a first embodiment of the apparatus according to the invention, the means (not shown) for determining any difference between said run-off peak and said reference run-off peak is adapted to measure any difference in peak width between these run-off peaks, a wider run-off peak ending after said reference run-off peak indicating an insertion mutation, while a wider run-off peak beginning before said reference run-off peak indicates a deletion mutation, the size of the peak width difference being directly proportional to the size of the mutation.

In a second embodiment of the apparatus according to the invention, the means (not shown) for determining any difference between said run-off peak and said reference run-off peak is adapted to measure any difference between the location of the centers of these run-off peaks, a location of the center of said run-off peak in front of the center of said reference run-off peak indicating a deletion mutation, a location of the center of said run-off peak after the center of said reference run-off peak indicating an insertion mutation, the size of said difference being directly proportional to the size of the mutation.

The apparatus according to the invention is preferably implemented in computer software.

What is claimed is:
 1. In a method for determining the existence of a mutation in a nucleic acid fragment from an electric signal generated by a DNA sequencer and made up of sequence information produced by fluorescent fragment products of different lengths of said nucleic acid fragment, followed by a run-off peak produced by fluorescent full length fragment products of said nucleic acid fragment, wherein the improvement comprises determining any difference between said run-off peak and a reference run-off peak generated by unmutated full length fragment products of said nucleic acid fragment, a difference indicating the existence of a deletion or insertion mutation.
 2. Method according to claim 1, characterized in that said difference between said run-off peak and said reference run-off peak is determined by measuring any difference in peak width between these run-off peaks, a wider run-off peak ending after said reference run-off peak, indicating an insertion mutation, while a wider run-off peak beginning before said reference run-off peak indicating a deletion mutation, the size of the peak width difference being directly proportional to the size of the mutation.
 3. Method according to claim 1, characterized in that said difference between said run-off peak and said reference run-off peak is determined by measuring any difference between the location of the centers of these run-off peaks, a location of the center of said run-off peak in front of the center of said reference run-off peak indicating a deletion mutation, a location of the center of said run-off peak after the center of said reference run-off peak indicating an insertion mutation, the size of said difference being directly proportional to the size of the mutation.